Wiki-LDA: A Mixed-Method Approach for Effective Interest Mining on Twitter Data

Authors

  • Xiao Pu
  • Mohamed Amine Chatti
  • Hendrik Thüs
  • Ulrik Schroeder
Abstract

Learning analytics (LA) and educational data mining (EDM) have emerged as promising technology-enhanced learning (TEL) research areas in recent years. Both areas deal with the development of methods that harness educational data sets to support the learning process. A key area of application for LA and EDM is learner modelling. Learner modelling makes it possible to build adaptive and personalized learning environments that take into account the heterogeneous needs of learners and provide them with learning experiences tailored to those needs. As learning increasingly happens in open and distributed environments beyond the classroom, and access to information in these environments is mostly interest-driven, learner interests are an important learner feature to model. In this paper, we focus on the interest dimension of a learner model and present Wiki-LDA as a novel method to effectively mine users' interests on Twitter. We apply a mixed-method approach that combines Latent Dirichlet Allocation (LDA), text mining APIs, and Wikipedia categories. Wiki-LDA has proven effective at the task of interest mining and classification on Twitter data, outperforming standard LDA.

Similar resources

A Knowledge Management Approach to Discovering Influential Users in Social Media

A key step for a marketer's success is to discover influential users who diffuse information and whose followers are interested in that information and spread it further on social media. Such users can reduce the cost of advertising, increase sales and maximize the diffusion of information. A key problem is how to precisely identify the most influential users on social networks. In this pape...

A High-Performance Model based on Ensembles for Twitter Sentiment Classification

Background and Objectives: Twitter Sentiment Classification is one of the most popular fields in information retrieval and text mining. Millions of people around the world use social networks like Twitter intensively. It lets users publish tweets to say what they think about topics. There are numerous web sites built on the Internet presenting Twitter. The user can enter a sentiment ta...

2016 Olympic Games on Twitter: Sentiment Analysis of Sports Fans Tweets using Big Data Framework

Big data analytics is one of the most important subjects in computer science. Today, due to the increasing expansion of Web technology, a large amount of data is available to researchers. Extracting information from these data is one of the requirements for many organizations and business centers. In recent years, the massive amount of Twitter's social networking data has become a platform for ...

Design and Test of the Real-time Text mining dashboard for Twitter

One of today's major research trends in the field of information systems is the discovery of implicit knowledge hidden in datasets that are currently being produced at high speed, in large volumes and in a wide variety of formats. Data with such features is called big data. Extracting, processing, and visualizing this huge amount of data has become one of the concerns of data science scholar...

Patterns Prediction of Chemotherapy Sensitivity in Cancer Cell lines Using FTIR Spectrum, Neural Network and Principal Components Analysis

Drug resistance enables cancer cells to escape the cytotoxic effect of anticancer drugs. Identification of the resistant phenotype is very important because it can lead to an effective treatment plan. There is an interest in developing classification models of the resistance phenotype based on multivariate data. We have investigated a vibrational spectroscopic approach in order to characterize a...

Publication date: 2016